gradient function
Recent Advances in Non-convex Smoothness Conditions and Applicability to Deep Linear Neural Networks
Patel, Vivak, Varner, Christian
The presence of non-convexity in smooth optimization problems arising from deep learning has sparked new smoothness conditions in the literature, along with corresponding convergence analyses. We discuss these smoothness conditions, order them, provide conditions for determining whether they hold, and evaluate their applicability to training a deep linear neural network for binary classification.
- North America > United States > Wisconsin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Hampshire > Hillsborough County > Nashua (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
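To make the notion of a smoothness condition concrete, here is a small numerical probe related to the entry above, assuming the commonly studied $(L_0, L_1)$-smoothness inequality $\|\nabla f(x)-\nabla f(y)\| \le (L_0 + L_1\|\nabla f(x)\|)\,\|x-y\|$ as a representative condition and a two-layer linear network with logistic loss as a stand-in; the depth, constants, and loss are illustrative assumptions, not the paper's setup.

```python
# Numerically probe an (L0, L1)-smoothness inequality for the logistic loss
# of a small deep linear network (all sizes and constants are illustrative).
import torch

torch.manual_seed(0)
n, d, h = 32, 5, 4
X = torch.randn(n, d)
y = (torch.rand(n) < 0.5).float() * 2 - 1             # labels in {-1, +1}

def loss(W1, W2):
    logits = (X @ W1.T @ W2.T).squeeze(-1)             # deep linear predictor
    return torch.nn.functional.softplus(-y * logits).mean()

def grad_vec(W1, W2):
    g1, g2 = torch.autograd.grad(loss(W1, W2), (W1, W2))
    return torch.cat([g1.reshape(-1), g2.reshape(-1)])

def rand_params(scale):
    return (torch.randn(h, d).mul_(scale).requires_grad_(),
            torch.randn(1, h).mul_(scale).requires_grad_())

# Count violations of a candidate (L0, L1) pair over random parameter pairs:
# ||g(p) - g(q)|| <= (L0 + L1 ||g(p)||) ||p - q||.
L0, L1, violations = 1.0, 1.0, 0
for _ in range(200):
    p, q = rand_params(1.0), rand_params(2.0)
    gp = grad_vec(*p)
    dist = torch.cat([(a - b).reshape(-1) for a, b in zip(p, q)]).norm()
    lhs = (gp - grad_vec(*q)).norm()
    rhs = (L0 + L1 * gp.norm()) * dist
    violations += int(lhs > rhs)
print(f"violations of the candidate (L0, L1) bound: {violations}/200")
```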
A Differentiable Approach to Multi-scale Brain Modeling
Wang, Chaoming, Lyu, Muyang, Zhang, Tianqiu, He, Sichao, Wu, Si
We present a multi-scale differentiable brain modeling workflow utilizing BrainPy, a unique differentiable brain simulator that combines accurate brain simulation with powerful gradient-based optimization. We leverage this capability of BrainPy across different brain scales. At the single-neuron level, we implement differentiable neuron models and employ gradient methods to optimize their fit to electrophysiological data. On the network level, we incorporate connectomic data to construct biologically constrained network models. Finally, to replicate animal behavior, we train these models on cognitive tasks using gradient-based learning rules. Experiments demonstrate that our approach achieves superior performance and speed in fitting generalized leaky integrate-and-fire and Hodgkin-Huxley single neuron models. Additionally, training a biologically-informed network of excitatory and inhibitory spiking neurons on working memory tasks successfully replicates observed neural activity and synaptic weight distributions. Overall, our differentiable multi-scale simulation approach offers a promising tool to bridge neuroscience data across electrophysiological, anatomical, and behavioral scales.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)
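As a rough illustration of the single-neuron fitting step described in the entry above, here is a generic differentiable fit of a subthreshold leaky integrate-and-fire model written in plain PyTorch; it does not use BrainPy's actual API, and the dynamics, constants, and optimizer settings are assumptions for illustration only.

```python
# Fit LIF parameters (tau, R) to a target voltage trace by backpropagating
# through the simulated membrane dynamics (subthreshold, Euler integration).
import torch

dt, T = 0.1, 200                           # ms per step, number of steps
I = 1.5 * torch.ones(T)                    # constant input current (illustrative)

def simulate(tau, R, v_rest=-65.0):
    v, trace = torch.tensor(v_rest), []
    for t in range(T):
        v = v + dt / tau * (-(v - v_rest) + R * I[t])
        trace.append(v)
    return torch.stack(trace)

with torch.no_grad():                      # synthetic "recording" from known parameters
    target = simulate(torch.tensor(12.0), torch.tensor(8.0))

tau = torch.tensor(20.0, requires_grad=True)
R = torch.tensor(5.0, requires_grad=True)
opt = torch.optim.Adam([tau, R], lr=0.1)
for step in range(300):
    opt.zero_grad()
    ((simulate(tau, R) - target) ** 2).mean().backward()
    opt.step()
print(tau.item(), R.item())                # should move toward 12.0 and 8.0
```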
Trapezoidal Gradient Descent for Effective Reinforcement Learning in Spiking Networks
Pan, Yuhao, Wang, Xiucheng, Cheng, Nan, Qiu, Qi
With the rapid development of artificial intelligence technology, the field of reinforcement learning has continuously achieved breakthroughs in both theory and practice. However, traditional reinforcement learning algorithms often entail high energy consumption during interactions with the environment. Spiking Neural Networks (SNNs), with their low energy consumption and performance comparable to deep neural networks, have garnered widespread attention. To reduce the energy consumption of practical reinforcement learning applications, researchers have successively proposed the Pop-SAN and MDC-SAN algorithms. Nonetheless, these algorithms use rectangular functions to approximate the spiking activation during training, resulting in low sensitivity and leaving room for improvement in the training effectiveness of SNNs. To address this, we propose a trapezoidal approximation gradient method to replace the rectangular one, which not only preserves the original stable learning state but also enhances the model's adaptability and response sensitivity under various signal dynamics. Simulation results show that the improved algorithm, which uses the trapezoidal approximation gradient, achieves better convergence speed and performance than the original algorithm and demonstrates good training stability.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.42)
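A minimal sketch of the surrogate-gradient idea in the entry above: keep the hard spike in the forward pass and backpropagate through a trapezoid instead of a rectangle. The threshold, width, and plateau values are illustrative assumptions, not the exact trapezoid used by this paper or by Pop-SAN/MDC-SAN.

```python
import torch

class TrapezoidSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; trapezoidal surrogate gradient
    (flat top around the threshold, linear ramps to zero) in the backward pass."""

    @staticmethod
    def forward(ctx, v, threshold, width, plateau):
        ctx.save_for_backward(v)
        ctx.threshold, ctx.width, ctx.plateau = threshold, width, plateau
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        dist = (v - ctx.threshold).abs()
        # 1 on |v - threshold| <= plateau, decaying linearly to 0 at width,
        # 0 beyond; a rectangular surrogate would keep only the flat top.
        ramp = (ctx.width - dist) / (ctx.width - ctx.plateau)
        surrogate = torch.clamp(ramp, 0.0, 1.0) / ctx.width
        return grad_output * surrogate, None, None, None

v = torch.linspace(-1.0, 3.0, steps=9, requires_grad=True)
spikes = TrapezoidSpike.apply(v, 1.0, 1.0, 0.4)   # threshold, width, plateau
spikes.sum().backward()
print(v.grad)   # nonzero only in a trapezoidal band around the threshold
```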
Gradient Networks
Chaudhari, Shreyas, Pranav, Srinivasa, Moura, José M. F.
Directly parameterizing and learning gradients of functions has widespread significance, with specific applications in optimization, generative modeling, and optimal transport. This paper introduces gradient networks (GradNets): novel neural network architectures that parameterize gradients of various function classes. GradNets exhibit specialized architectural constraints that ensure correspondence to gradient functions. We provide a comprehensive GradNet design framework that includes methods for transforming GradNets into monotone gradient networks (mGradNets), which are guaranteed to represent gradients of convex functions. We establish the approximation capabilities of the proposed GradNet and mGradNet. Our results demonstrate that these networks universally approximate the gradients of (convex) functions. Furthermore, these networks can be customized to correspond to specific spaces of (monotone) gradient functions, including gradients of transformed sums of (convex) ridge functions. Our analysis leads to two distinct GradNet architectures, GradNet-C and GradNet-M, and we describe the corresponding monotone versions, mGradNet-C and mGradNet-M. Our empirical results show that these architectures offer efficient parameterizations and outperform popular methods in gradient field learning tasks.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
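As a sketch of what such an architectural guarantee can look like, the single-layer construction below outputs the gradient of an explicitly convex potential; this is a standard illustrative construction under assumed sizes, not necessarily the paper's GradNet-C/GradNet-M architecture.

```python
import torch

d, m = 3, 16
W = torch.randn(m, d)
b = torch.randn(m)

def grad_net(x):
    # g(x) = W^T sigmoid(W x + b): its Jacobian W^T diag(sigmoid') W is
    # symmetric PSD, so g is the gradient of the convex potential below.
    return W.T @ torch.sigmoid(W @ x + b)

def potential(x):
    # F(x) = sum_i softplus(w_i^T x + b_i), convex in x, with grad F = g.
    return torch.nn.functional.softplus(W @ x + b).sum()

x = torch.randn(d, requires_grad=True)
autograd_g = torch.autograd.grad(potential(x), x)[0]
print(torch.allclose(grad_net(x.detach()), autograd_g, atol=1e-5))  # True
```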
Storchastic: A Framework for General Stochastic Automatic Differentiation
van Krieken, Emile, Tomczak, Jakub M., Teije, Annette ten
Modelers use automatic differentiation of computation graphs to implement complex Deep Learning models without defining gradient computations. However, modelers often use sampling methods to estimate intractable expectations such as in Reinforcement Learning and Variational Inference. Current methods for estimating gradients through these sampling steps are limited: They are either only applicable to continuous random variables and differentiable functions, or can only use simple but high variance score-function estimators. To overcome these limitations, we introduce Storchastic, a new framework for automatic differentiation of stochastic computation graphs. Storchastic allows the modeler to choose from a wide variety of gradient estimation methods at each sampling step, to optimally reduce the variance of the gradient estimates. Furthermore, Storchastic is provably unbiased for estimation of any-order gradients, and generalizes variance reduction techniques to higher-order gradient estimates. Finally, we implement Storchastic as a PyTorch library.
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (17 more...)
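For context, here is a bare-bones score-function (REINFORCE) estimator of the kind Storchastic generalizes, written in plain PyTorch rather than the Storchastic library's API; the objective, sample count, and baseline are illustrative assumptions.

```python
import torch

# Objective: maximize E_{k ~ Categorical(softmax(theta))}[ f(k) ]. The sample
# k is discrete, so gradients cannot flow through it directly; the score
# function estimator uses grad = E[ f(k) * grad log p(k) ] instead.
torch.manual_seed(0)
theta = torch.zeros(4, requires_grad=True)
f = torch.tensor([0.1, 0.5, 0.2, 0.9])          # per-outcome reward (illustrative)
opt = torch.optim.SGD([theta], lr=0.5)

for step in range(500):
    opt.zero_grad()
    dist = torch.distributions.Categorical(logits=theta)
    k = dist.sample((64,))                      # 64 Monte Carlo samples
    reward = f[k]
    baseline = reward.mean().detach()           # simple variance reduction
    loss = -((reward - baseline) * dist.log_prob(k)).mean()
    loss.backward()
    opt.step()

print(torch.softmax(theta, dim=0))              # should concentrate on outcome 3
```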
Symmetrical Gaussian Error Linear Units (SGELUs)
In this paper, a novel neural network activation function, called the Symmetrical Gaussian Error Linear Unit (SGELU), is proposed to obtain high performance. This is achieved by integrating the stochastic-regularizer property of the Gaussian Error Linear Unit (GELU) with a symmetrical characteristic. By combining these two merits, the proposed unit gains the capability of bidirectional convergence, allowing the network to be optimized without the vanishing gradient problem. Evaluations of SGELU against GELU and the Linearly Scaled Hyperbolic Tangent (LiSHT) have been carried out on MNIST classification and an MNIST auto-encoder, validating its performance and convergence rate on these applications.
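For comparison, a sketch of GELU next to a symmetric (even) variant of the form $\alpha\, x\, \mathrm{erf}(x/\sqrt{2})$; this assumed form only illustrates the "symmetrical" idea and may not match the paper's exact definition of SGELU.

```python
import numpy as np
from scipy.special import erf

def gelu(x):
    # GELU(x) = x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + erf(x / np.sqrt(2.0)))

def sgelu(x, alpha=1.0):
    # Assumed symmetric variant: an even function, sgelu(-x) == sgelu(x),
    # so large inputs of either sign produce a large activation.
    return alpha * x * erf(x / np.sqrt(2.0))

x = np.linspace(-4, 4, 9)
print(gelu(x))
print(sgelu(x))
```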
Regularized Ensembles and Transferability in Adversarial Learning
Chen, Yifan, Vorobeychik, Yevgeniy
Despite the considerable success of convolutional neural networks in a broad array of domains, recent research has shown them to be vulnerable to small adversarial perturbations, commonly known as adversarial examples. Moreover, such examples have been shown to be remarkably portable, or transferable, from one model to another, enabling highly successful black-box attacks. We explore this issue of transferability and robustness along two dimensions: first, the impact of conventional $l_p$ regularization as well as replacing the top layer with a linear support vector machine (SVM), and second, the value of combining regularized models into an ensemble. We show that models trained with different regularizers present barriers to transferability, as does partial information about the models comprising the ensemble.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
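A small sketch of one ingredient described in the entry above: replacing the softmax top layer with a linear SVM head trained by a hinge loss plus explicit $l_2$ regularization. The convolutional feature extractor is stubbed out with random features, and all sizes and constants are placeholders, not the paper's configuration.

```python
import torch

feat_dim, n_classes, C = 128, 10, 1e-2
features = torch.randn(256, feat_dim)            # stand-in for CNN features
labels = torch.randint(0, n_classes, (256,))

svm_head = torch.nn.Linear(feat_dim, n_classes)  # linear SVM top layer
opt = torch.optim.SGD(svm_head.parameters(), lr=0.1)

for epoch in range(100):
    opt.zero_grad()
    scores = svm_head(features)
    hinge = torch.nn.functional.multi_margin_loss(scores, labels)   # hinge loss
    l2 = C * sum((p ** 2).sum() for p in svm_head.parameters())     # explicit l2
    (hinge + l2).backward()
    opt.step()

# An ensemble prediction would simply average the regularized members' scores.
```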
Scalable kernel-based variable selection with sparsistency
He, Xin, Wang, Junhui, Lv, Shaogao
Variable selection is central to high-dimensional data analysis, and various algorithms have been developed. Ideally, a variable selection algorithm should be flexible, scalable, and backed by theoretical guarantees, yet most existing algorithms cannot attain all of these properties at the same time. In this article, a three-step variable selection algorithm is developed, involving kernel-based estimation of the regression function and its gradient functions as well as a hard-thresholding step. Its key advantage is that it imposes no explicit model assumption, admits general predictor effects, allows for scalable computation, and attains desirable asymptotic sparsistency. The proposed algorithm can be adapted to any reproducing kernel Hilbert space (RKHS) with different kernel functions, and can be extended to interaction selection with slight modification. Its computational cost is only linear in the data dimension and can be further improved through parallel computing. The sparsistency of the proposed algorithm is established for general RKHS under mild conditions, including linear and Gaussian kernels as special cases. Its effectiveness is also supported by a variety of simulated and real examples.
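A rough sketch of the three-step recipe under stated assumptions: a Gaussian-kernel ridge estimate of the regression function, its gradient functions evaluated at the sample points, and a hard threshold on each predictor's empirical gradient norm. The bandwidth, ridge parameter, and thresholding rule here are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 2 * np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)  # only x1, x2 active

# Step 1: kernel ridge estimate f_hat(x) = sum_i alpha_i K(x_i, x).
gamma, lam = 0.1, 1e-2
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)
alpha = np.linalg.solve(K + lam * n * np.eye(n), y)

# Step 2: gradient functions of f_hat at the sample points, using
# d/dx_j exp(-gamma ||x - x_i||^2) = -2 gamma (x_j - x_ij) K(x_i, x).
diffs = X[:, None, :] - X[None, :, :]                  # (n, n, p)
grads = -2 * gamma * np.einsum("ij,ijk->ik", K * alpha[None, :], diffs)

# Step 3: hard-threshold the empirical norm of each gradient function.
scores = np.sqrt((grads ** 2).mean(0))
threshold = 0.1 * scores.max()                         # illustrative rule
print(scores.round(3))
print("selected:", np.where(scores > threshold)[0])    # typically indices 0 and 1
```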